Data-driven system identification of the social
network dynamics in online postings of an extremist group


Alejandro R. Diaz, Jongeun Choi 


Mechanical Engineering 
Michigan State University 
East Lansing, MI 48824 
diaz@egr.msu.edu, jchoi@egr.msu 


Abstract—Terrorism research has begun to focus on the issue 
of radicalization, or the acceptance of ideological belief systems 
that lead toward violence. There has been particular attention paid 
to the role of the Internet in the exposure to and promotion of 
radical ideas. There is, however, minimal work that attempts to 
model the ways that messages are spread or how individual 
participation in radical on-line communities operates. In this 
paper, we present a stochastic linear system to represent the 
evolution of contribution to a sample of 126 threads in an on-line 
forum where individuals discuss radical belief systems. To estimate
or predict the time-varying contributions of agents from given
online-forum data, each agent’s contribution is modeled as a state
variable. We then use the expectation-maximization (EM)
algorithm to identify the model parameters including the adjacency 
matrix of the graph constructed among participating agents along 
with measurement and system uncertainty levels in online-postings. 
Our approach reveals the identified dynamical influences among 
agents in the time-varying shaping of the contribution in a data- 
driven fashion. We use the real-world data from online-postings to 
demonstrate the usefulness of our approach, and its application 
toward on-line radicalization. 


Keywords— cyberterrorism; radicalization; Kalman filter; EM
algorithm; forecasting; maximum likelihood estimation. 


I. INTRODUCTION 


Terrorism research has dramatically expanded in the last two 
decades, with 90% of the publications associated with this 
field being generated after the 9/11 attacks in the United States 
[1]. There has been particular growth in empirical 
examinations of the foreground and situational dynamics that 
may lead individuals to engage in ideologically motivated acts 
of violence, which is also referred to as a radicalization 
process [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]. Much of this research 
is driven by continued growth in both domestic and 
international terror groups and movements, as well as 
continued high profile terror attacks [2, 11, 12, 13]. 

There is also substantial concern over the ways that terror 
and extremist groups use social media and the Internet as a 
vehicle for recruitment [11]. There have been several notable 
cases of individuals in the US being contacted by members of 


978-1-5090-6096-2/16/$31.00 ©2016 IEEE 


Thomas J. Holt, Steven Chermak 
School of Criminal Justice 
Michigan State University 

East Lansing, MI 48824 
holtt@msu.edu, chermak@msu.edu 


Joshua D. Freilich 


Department of Criminal Justice 
John Jay College, CUNY 
New York, NY, 10119 
jfreilich@jjay.cuny.edu 


al Qaeda and ISIS living abroad to recruit them into the 
movement [2, 15]. The threat of domestic radicalization
occurring without any physical contact with foreign extremist
movements is relatively unparalleled, and calls into question
how the Internet may engender acceptance of radical belief
systems without physical interaction or integration into larger
actor networks [2, 11].

The Internet creates a source for ideological information 
sharing 24 hours a day, from any part of the globe, facilitating 
equal access to radical messages and networks where 
individuals may gain entrance to extremist groups [16, 17, 18, 
19]. Social media sites like Facebook and Twitter enable
individuals to find others who self-identify as members of
radical movements and to discuss their views freely with little fear
of reprisal [11, 20]. In addition, individuals use the Internet to
identify others in geographic proximity, or arrange meetings
and events in the real world [11]. In much the same way,
digital media allow individuals to create videos and magazines
that can be posted and shared in various social media outlets
like YouTube [21]. Finally, posting videos and news stories
through social media provides groups with another mechanism to
publicly refute claims from the media and government and to
portray themselves in a positive light [21, 22].

As a result, the Internet is now a critical resource that 
terror groups use to maintain their presence and growth over 
time [11, 23, 24]. This is true for all forms of extremism and 
terror, whether white nationalists and others associated with 
the Far Right, or jihadist movements globally [21, 25]. The 
Internet's on-demand nature means that individual exposure to 
radical ideas and content is limited only by their willingness to 
spend time on-line. 

One key avenue to explore radicalization may be through 
web forums, which have been a long-standing form of 
asynchronous on-line communication used by deviant [26, 27, 
28] and non-deviant groups alike [29, 30, 31]. Forums act as 
online discussion groups where individuals present issues or 
discuss problems [26, 32, 33, 34]. They are composed of 
“threads” which begin when a registered user creates a post 
within a forum, asking a question or making a statement. 


Other people respond to the remarks with posts of their own 
that are connected together to create threads. Thus, threads are 
composed of posts that center on a specific topic under a 
forum’s general heading. Since posters respond to the ideas of 
others, the exchanges present in the strings of a forum 
demonstrate relationships between individuals [32, 33, 35]. 
These connections can serve as a key data source to 
understand the process of social engagement within any 
subculture [32, 33, 35]. In turn, posting behavior may 
demonstrate the ways that individuals become enculturated 
into an extremist group and relate to others online.

The domestic far-right in the US has long used the Internet 
and forums in particular to communicate with followers, 
recruit new members, and propagandize. In the early 1990s, 
public officials, watch-groups, and scholars noted that 
technologies were critical to the growth of the far-right militia 
movement [36, 37, 38, 39, 40]. While the militia movement 
declined in the late 1990s, it has recently surged, partially due 
to online discussion forums and email distribution lists [41]. 
Similarly, the racist far-right online site Stormfront.org was 
established in the 1990s [42, 43] and has attracted a massive 
user population and financial support from posters who 
directly fund forum operations and hosting [44]. In fact, 
Stormfront.org has emerged as a model for many groups that 
have resulted in “virtual communities of hate” [45, 46]. 

Forums provide individuals with a variety of information, 
such as justifications for the use of violence, and calls for 
attacks against general categories of enemies (e.g., the New 
World Order, the hedonistic West, the Zionist Occupied 
Government, the police, the Jews, etc.) [42]. The importance 
of online identities of hate groups was noted by [47] who 
found that the vast majority of groups involved in 
ideologically-motivated violence had an Internet presence. 

The more time individuals spend on websites and social
media, the more messaging they may encounter and the more
they may be enculturated into social networks that espouse
violence or radical ideas. Time spent in forums can also
provide individuals with information on the argot and outward
symbols of movement membership. Members of the white
supremacist community demonstrate their identity through the
use of linguistic indicators such as 88 (standing for HH, or
Heil Hitler), or phrases such as the 14 Words of David Lane
[11, 48]. The use of text-based identifiers is key since the lack
of physical cues in on-line spaces requires individuals to
demonstrate their group affiliation in some fashion [26].

There have, however, been few investigations of the 
network structures and posts made within forums used by 
members of extremist groups. One of the few studies of this 
nature assessed the content of communications in jihadist 
websites [49]. Evidence suggests that most forum discussions
do not last long, involve a limited number of participants, and
consist of short posted entries. [49] also examined the
types of posts, such as whether religious sources were 
referenced or if there was encouragement to commit jihad, 
instructions regarding military training, or social interactions. 


This analysis, however, gave no information on the 
relationship between online posts and ideologically motivated 
violence committed by supporters of these forums and 
organizations. 

As a result, the publicly accessible online activities of 
members of extremist movements may provide signals of the 
radicalization process in action. Posts in web forums provide 
direct, overt information on the relationships between actors in 
larger networks [26, 50]. Combining network analyses with 
the content of posts made by individuals will provide direct 
information on the role that posts of ideological rhetoric and 
information play in larger networks of extremist communities. 
Those who constantly post in forums may be more pivotal 
actors within a network, serving as hubs to link individuals 
and facilitate the flow of information. Such information 
would be invaluable to identify the flow of radical messages 
from group to group, and understand how posting habits and
participation in multiple forms of computer-mediated
communication (CMC) shape an individual's
role in the prospective radicalization of others. These data will
not, however, speak to connections that may emerge between 
participants through private messages made via email or other 
mechanisms [32, 33]. Thus, this analysis can only speak to the 
public process of radicalization, which occurs in tandem with 
direct messaging between individuals in and outside of 
extremist groups [21, 23, 24, 48, 49]. 

Few researchers have considered this issue, let alone 
examined the network structures that underlie extremist 
movements online. [9] and [2] studied the radicalization 
process of al Qaeda-influenced terrorists and found that 
individuals became radicalized in clusters that were created in 
part through peer associations. Few have considered how 
network structures of extremist groups function to connect 
members together and prospectively radicalize individuals to 
violence. It is unclear if members who frequently post content 
are central actors in larger networks of extremists, or if they 
are simply producing a greater degree of noise relative to a 
smaller proportion of individuals who are tied to multiple 
groups yet are infrequently online. Further, there is minimal 
knowledge of the ways that far-right extremist groups with 
members living in close physical proximity are linked to other 
groups through the web. Such information is invaluable to 
identify ways to disrupt networks that provide messages that 
lead to radicalization, based on the presence of redundant 
relationships, resiliency of message delivery, and network 
density [51]. 

At the same time, close to 99% of persons exposed to 
radical messages never actually engage in violence themselves 
[52]. Thus, it is critical to understand how individuals are 
socialized into radical or extremist movements, and how 
individual participation in on-line movements changes over 
time. This study considered these questions using a set of 
threads collected from a nationally-known forum used by 
members of the Far Right to connect and discuss various 
issues. 


To understand the social dynamics in such online-postings, 
we present a stochastic linear system to represent the 
evolution of the contribution in online-postings. To estimate 
or predict the contributions of agents for given web-forum 
data, each agent’s contribution has been modeled as a state 
variable. We then use the expectation-maximization (EM) 
algorithm to estimate the model parameters including the
adjacency matrix of the graph constructed among participating
agents along with measurement and system noise processes in
online-postings. Our approach reveals the estimated 
dynamical influences among agents in the time-varying 
shaping of the contribution in a data-driven fashion. We use 
the real-world data from the major Far Right web forum to 
demonstrate the usefulness of our approach. 


II. BUILDING A GRAPH THEORETICAL MODEL FROM FORUM
POST DATA


The data set was collected from a single sub-forum of a 
major Far Right web forum. All 126 threads made in that 
forum, consisting of posts from 46 agents over a period of 
about 1500 days, were downloaded by the researchers and 
saved as complete web pages in order to maintain the posts in 
their original state. A thread is composed of posts, whereby 
an individual creates an initial message that is made available 
for others to see within the forum. Responses to that initial 
post create a thread, which resembles other forms of naturally 
occurring communications, particularly group discussions [33, 
34]. Since anyone who is a member of the forum can read a 
post made in a thread at any given time, it creates the 
opportunity to understand the influence of asynchronous 
communication on the transfer of information, attitudes, and 
beliefs on others. 


The posts from each of these threads were used to 
construct a network that represents relationships among agents 
and to assess the level of engagement of each agent in that
sub-forum and its evolution over time. Data are organized in
threads $\mathcal{T} = \{\mathcal{T}^1, \ldots, \mathcal{T}^N\}$. Each thread is a collection of data
structures containing information about each posting in the
thread, including the author’s (agent) identifying handle, time
of posting, and content (content is not used in this work). In
particular, let

$H^\alpha = \{h_k^\alpha\}$ and $T^\alpha = \{\tau_k^\alpha\}$

for threads $\alpha = 1, \ldots, N$, where

• $h_k^\alpha$ is the unique handle identifying the author
of the k-th posting (note that, as the same individual
may post more than once in the same thread, $H^\alpha$
often contains repeated entries) and

• $\tau_k^\alpha$ stores the time of posting, in number of days from
a reference datum.
We use the author lists $H^\alpha$ to represent the data in a
graph-theoretical framework. To this end, from each thread $\alpha$
and author list, we build a subgraph $G^\alpha = (V^\alpha, E^\alpha)$ with
vertices $V^\alpha$ and edges $E^\alpha$ by letting

• Vertices $V^\alpha$ in $G^\alpha$ correspond to agents in $H^\alpha$
(removing duplicates), i.e., $V^\alpha = \{v : v \in H^\alpha\}$.
• $G^\alpha$ be a complete subgraph, i.e., $(i, j) \in E^\alpha$ if
$i \in V^\alpha$ and $j \in V^\alpha$.

Fig. 1. Graphs associated with individual threads and complete graph: (a) three representative sub-graphs $G^\alpha$, $\alpha = 1, 2$ and 3; (b) complete graph $G$.

As agents often participate in several threads, sub-graphs
often share vertices and edges. The complete data set is
represented by the graph $G = (V, E)$ where

$V = \bigcup_{\alpha=1}^{N} V^\alpha$ and $E = \bigcup_{\alpha=1}^{N} E^\alpha$.


To illustrate, let the data be a collection of 16 postings
from seven agents with handles {v1, v2, v3, v4, v5, v6, v7} in
three threads: $H^1$ = {v1, v2, v1, v3, v1, v2}, $H^2$ = {v2, v3, v7,
v2, v5, v2, v3} and $H^3$ = {v5, v6, v5}. For these data

$V^1$ = {v1, v2, v3}, $V^2$ = {v2, v3, v4, v5, v7} and $V^3$ = {v5, v6}.

The sub-graphs $G^1$, $G^2$ and $G^3$ associated with these data are
shown in Fig. 1(a), along with the complete graph $G$ (Fig.
1(b)).
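The construction of the sub-graphs and of their union can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `thread_subgraph` and `complete_graph` are hypothetical, and the author lists are copied from the illustration as printed, so a handle that does not occur in any printed list (such as v4) will not appear in the computed vertex sets.

```python
from itertools import combinations

# Author lists H^alpha, copied from the illustration as printed.
threads = [
    ["v1", "v2", "v1", "v3", "v1", "v2"],
    ["v2", "v3", "v7", "v2", "v5", "v2", "v3"],
    ["v5", "v6", "v5"],
]


def thread_subgraph(authors):
    """Return (V, E) for one thread: unique handles and all pairs among them."""
    vertices = set(authors)  # repeated handles collapse to one vertex
    # Undirected edges are stored as sorted tuples so (a, b) and (b, a) coincide.
    edges = {tuple(sorted(pair)) for pair in combinations(vertices, 2)}
    return vertices, edges


def complete_graph(thread_lists):
    """Return the union of the per-thread vertex and edge sets."""
    V, E = set(), set()
    for authors in thread_lists:
        v, e = thread_subgraph(authors)
        V |= v
        E |= e
    return V, E


V, E = complete_graph(threads)
print(sorted(V))
print(sorted(E))
```

Storing each edge as a sorted tuple is one simple way to reflect the assumption that the graphs are undirected; an adjacency matrix can be filled from `E` once the handles are mapped to integer indices.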


In building the graphs $G^\alpha$ associated with the various
threads we have made a number of important assumptions. In
particular,

• Agents are connected if and only if they have
contributed to the same thread.

• Agents will influence each other’s engagement
directly only if their vertices are connected in the
graph.

• Graphs are undirected. This means that if $(a, b) \in E$, a
is as likely to influence b’s engagement as b is likely
to influence a’s.

• Even though agents contribute their postings at
different times, we assume that connections between
agents exist for all times, i.e., edges in E are not
time-dependent. Note, however, that updates from the EM
algorithm do account for the time of participation of
each agent and, implicitly, for the persistence of the
connection between agents, at least in some
time-averaged sense.


Later, we will explore the implications of removing some 
of these assumptions and ways to address them. 


We use the information on the time of posting $\tau_k^\alpha$, along
with the number of postings, to assess the degree of
engagement of each agent in the conversation. This is
described next.


III. ESTIMATING EVOLUTION OF ENGAGEMENT FROM THE
THREADS 


As mentioned above, network connections among agents 
are not likely to be formed all at once, as the forum's activity 
may span long time periods, possibly years, and agents may 
enter a thread for the first time at any time within this period. 
Nevertheless, we establish the network of connections among 
agents from the data, independently from the time at which the 
individual enters the thread. Time and number of contributions 
are used to estimate the time-evolution of engagement of each 
agent, as defined below. 


Let the first posting in the data occur at the reference time
$\tau = 0$ and the last one at $\tau_{\max} = \max_{\alpha, k} \tau_k^\alpha$. For simplicity, we
assume all time units are in days. We partition the total time
$[0, \tau_{\max}]$ into n bins $b_t$ of fixed size, indexed by $t = 1, \ldots, n$, and
count the number of postings by each agent within each bin.
For instance, we let

$b_t = [\tau_t, \tau_t + \Delta]$ and $\tau_{t+1} = \tau_t + d$ with $d < \Delta$.


Fig. 2. Contributions by all agents over time. Those of agent 15 within a
given bin $b_t$ are highlighted.


Bin size $\Delta$ and increment d (both measured in days) can be
selected to allow overlap (for data smoothing). With this
construction, it is straightforward to extract from the data the
quantity c(i,t), measuring the number of postings by agent i
within bin $b_t$.

Figure 2 shows contributions corresponding to data
collected from 126 threads with contributions from 46 agents
over a period of about 1500 days. The time of each posting $\tau$
is shown along the horizontal axis (measured from the first
entry). In the figure, a contribution from agent i is entered as an
“x” displayed a distance i from the horizontal axis, placed at
the appropriate time $\tau$. Letting $b_{10} = [450, 500]$, the plot shows
that c(15,10) = 4.
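The binning step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `count_postings`, the integer agent indices, and the toy posting times are all hypothetical, and bins are indexed from 0 here rather than from 1 as in the text.

```python
import numpy as np


def count_postings(agent_ids, times, n_agents, n_bins, width, step, start=0.0):
    """Return c with c[i, t] = number of postings by agent i in bin t.

    Bin t (0-based) covers the closed interval
    [start + t * step, start + t * step + width]; bins overlap when step < width.
    """
    agent_ids = np.asarray(agent_ids, dtype=int)
    times = np.asarray(times, dtype=float)
    c = np.zeros((n_agents, n_bins), dtype=int)
    for t in range(n_bins):
        lo = start + t * step
        in_bin = (times >= lo) & (times <= lo + width)
        # bincount tallies how many postings in this bin belong to each agent.
        c[:, t] = np.bincount(agent_ids[in_bin], minlength=n_agents)
    return c


# Hypothetical toy data: six postings by three agents (times in days).
agent_ids = [0, 0, 1, 2, 0, 1]
times = [0, 10, 30, 60, 70, 120]
c = count_postings(agent_ids, times, n_agents=3, n_bins=4, width=50, step=25)
print(c)
```

With `width=50` and `step=25` each posting can fall into two consecutive bins, which is the smoothing effect of overlapping bins mentioned above.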

This model demonstrates that participation in the forum's 
threads is not evenly distributed. A small proportion of users 
are frequent posters, compared to the larger body of 
individuals making occasional posts in keeping with patterns 
of participation from research on deviant on-line communities 
generally [26, 34, 35]. This suggests that individuals who are 
frequent posters may have a greater influence on the tenor of 
discussions and potentially exert control over the radical 
nature of conversation. For terrorism research, there is value 
in identifying how individual participation changes over time, 
and if those who post more or less frequently have any change 
in their attitudes or beliefs. 


We associate higher values of c with higher levels of 
engagement and expect engagement to vary from agent to 
agent and, for a given agent, to vary with time. We discuss
how we model the dynamic evolution of agent engagement in
the next section.


IV. KALMAN FILTER ESTIMATION FORMULATION FOR AGENT 
ENGAGEMENT 


In this section, we show how to model the evolution of the
state as a stochastic dynamical system that is subject to
measurement and process noises. The objective is two-fold.
First, to estimate the state x from the observations, we
formulate our model as a discrete-time stochastic dynamical
system. Next, we perform system identification via the
expectation-maximization (EM) algorithm [54] in order to
estimate the social-network influences among agents
in the formulated stochastic system.


At time t let $x_t$ be a p×1 vector representing an
“idealized” measure of engagement of agents contributing to
threads $\mathcal{T}$, described in the previous section. p is the total
number of agents in the data and t = 1, 2, ..., n represents the
sampling time associated with bins $b_1, b_2, \ldots, b_n$. We assume
that scaling is possible so that $0 \le x_{i,t} \le 1$, larger values
representing more engagement.


The state vector $x_t$ is not observed directly. Instead, we
assume it appears in a random regression model of the form

$y_t = M x_t + z_t$, $t = 1, 2, \ldots, n$. (1)

$x_0$ is a random vector with mean $\mu$ and covariance $\Sigma$. In (1),
$y_t$ is the measurement at time t, in our case extracted from the
collected data c(i,t). Here $y_t$ itself measures engagement,
and M is the p×p identity matrix. The actual measurement
c(v,t) is as described in the previous section, namely, c(v,t)
represents the number of postings by agent v in measurement
interval $b_t$. We construct $y_t$ from c by assigning an evaluation
of “full engagement at time t” (i.e., $y_{i,t} = 1$) whenever an
individual contributes 6 or more postings within interval $b_t$.
Thus, we define for agent i

$y_{i,t} = \min(c(i,t)/6, 1)$,

i = 1, 2, ..., p and t = 1, 2, ..., n. Measurements are subject to
measurement error, $z_t$. We assume $z_t$ is a p×1 vector of
random, uncorrelated errors with covariance matrix R and zero
mean.
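The mapping from posting counts to measurements is a simple saturation, as the short sketch below shows. The function name `engagement` and the sample counts are hypothetical; only the threshold of 6 postings comes from the text.

```python
import numpy as np

FULL_ENGAGEMENT = 6  # postings per bin treated as full engagement in the text


def engagement(counts, threshold=FULL_ENGAGEMENT):
    """Map posting counts c(i, t) to measurements y in [0, 1]."""
    counts = np.asarray(counts, dtype=float)
    # Scale by the threshold and saturate at 1 (full engagement).
    return np.minimum(counts / threshold, 1.0)


# Hypothetical counts for one agent over four bins.
y = engagement([[0, 3, 6, 9]])
print(y)
```

Any count at or above the threshold yields the same measurement, so the model does not distinguish between, say, 6 and 9 postings in a bin.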

Transition from $x_t$ to $x_{t+1}$ is modeled as a first order process
along with a system noise process,

$x_t = \Phi x_{t-1} + w_t$. (2)

Here $\Phi$ is a p×p transition matrix. The model in (2) is
subjected to a random, uncorrelated, normally distributed
p×1 vector $w_t$ with covariance matrix Q and zero mean.


Matrix $\Phi$ reflects the time-averaged interactions between
agents that best fit the data set over the sampling period.
Entry (i, j) gives a measure of the strength of the connection
between agents i and j. Initially, $\Phi$ will be obtained from a
linear model associated with the network graph G built from
the thread data. This construction is standard, and it follows,
e.g., the work in [53]. Specifically, in a (deterministic) discrete
time system associated with graph G, the protocol dictates that

$x_{t+1} = C_\varepsilon x_t$ (3)

where

$C_\varepsilon = I - \varepsilon L$. (4)

I is the p×p identity matrix and $\varepsilon$ is a scalar parameter,
adjusted so that $\min |\lambda_i| < 1$ where $\lambda_i$ is an eigenvalue of
$C_\varepsilon$. The reader may note that this construction is typical in
consensus problems with multi-agent systems [53]. In that
case, $L = D - A$ is the p×p Laplacian matrix associated
with graph G; $A = [a_{ij}]$ is the graph adjacency matrix with
$a_{ij} = 1$ if $(i, j) \in E$, otherwise $a_{ij} = 0$; and $D = \mathrm{diag}(d_{ii})$
with $d_{ii} = \sum_j a_{ij}$. Setting the step size $\varepsilon$ so that
$\varepsilon < 1/\max(d_{ii})$ ensures stability (see [53]).

The transition matrix $\Phi$ in (2) can be built from $C_\varepsilon$ in (4),
e.g., as

$\Phi = C_\varepsilon^k$ for $k \ge 1$. (5)
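The construction in (3)-(5) can be sketched as follows. The function name `consensus_transition`, the default choice of the step size (half of the stated bound), and the three-agent path graph are assumptions made for illustration only.

```python
import numpy as np


def consensus_transition(adjacency, eps=None, k=1):
    """Build Phi = (I - eps * L)^k from a symmetric 0/1 adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)          # d_ii = sum_j a_ij
    L = np.diag(degrees) - A         # graph Laplacian L = D - A
    if eps is None:
        # Assumption: any value below 1 / max(d_ii) would do; take half of it.
        eps = 0.5 / degrees.max()
    C = np.eye(A.shape[0]) - eps * L
    return np.linalg.matrix_power(C, k)


# Hypothetical path graph on three agents: 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
Phi = consensus_transition(A)
x0 = np.array([0.0, 0.5, 1.0])
x_late = consensus_transition(A, k=200) @ x0  # state after 200 noise-free steps
print(Phi)
print(x_late)
```

For this connected graph the noise-free iteration drives every entry of the state toward the average of the initial values, which is the consensus behavior used as the initial guess for the transition matrix.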


For the given measurements $y_1, \ldots, y_s$, the conditional
mean $x_t^s := E(x_t \mid y_1, \ldots, y_s)$, and the conditional covariance
matrices

$P_t^s = \mathrm{cov}(x_t \mid y_1, \ldots, y_s)$ and $P_{t,t-1}^s = \mathrm{cov}(x_t, x_{t-1} \mid y_1, \ldots, y_s)$

associated with this process can be obtained from the standard
Kalman filter equations (see Appendix).

Recall that the objective in this paper is to identify the system
in order to smooth the state given the whole data set, i.e.,
to find $x_t^n = E(x_t \mid y_1, \ldots, y_n)$, the minimum mean square error
smoothed estimator of $x_t$ based on the data [54].
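For fixed parameters, the filtered and smoothed estimates can be computed with the textbook Kalman filter followed by a backward smoothing pass. The sketch below is illustrative and not the authors' implementation: the function names, the toy parameters, and the uniformly random stand-in for the measurements are all hypothetical, and array index 0 corresponds to sampling time 1 in the text.

```python
import numpy as np


def kalman_filter(y, Phi, M, Q, R, mu0, Sigma0):
    """Forward pass: predicted (xp, Pp) and filtered (xf, Pf) moments."""
    n, p = y.shape[0], Phi.shape[0]
    xp, xf = np.zeros((n, p)), np.zeros((n, p))
    Pp, Pf = np.zeros((n, p, p)), np.zeros((n, p, p))
    x_prev, P_prev = mu0, Sigma0
    for t in range(n):
        xp[t] = Phi @ x_prev                      # one-step prediction
        Pp[t] = Phi @ P_prev @ Phi.T + Q
        K = Pp[t] @ M.T @ np.linalg.inv(M @ Pp[t] @ M.T + R)  # Kalman gain
        xf[t] = xp[t] + K @ (y[t] - M @ xp[t])    # measurement update
        Pf[t] = Pp[t] - K @ M @ Pp[t]
        x_prev, P_prev = xf[t], Pf[t]
    return xp, Pp, xf, Pf


def smoother(Phi, xp, Pp, xf, Pf):
    """Backward pass: smoothed moments given all n measurements."""
    n = xf.shape[0]
    xs, Ps = xf.copy(), Pf.copy()  # the last smoothed value equals the filtered one
    for t in range(n - 2, -1, -1):
        J = Pf[t] @ Phi.T @ np.linalg.inv(Pp[t + 1])
        xs[t] = xf[t] + J @ (xs[t + 1] - xp[t + 1])
        Ps[t] = Pf[t] + J @ (Ps[t + 1] - Pp[t + 1]) @ J.T
    return xs, Ps


# Hypothetical toy problem with two agents and identity measurement matrix.
rng = np.random.default_rng(0)
p, n = 2, 30
Phi, M = 0.9 * np.eye(p), np.eye(p)
Q = R = 0.2 * np.eye(p)
y = rng.uniform(0.0, 1.0, size=(n, p))   # stand-in for engagement data
xp, Pp, xf, Pf = kalman_filter(y, Phi, M, Q, R, np.zeros(p), 0.2 * np.eye(p))
xs, Ps = smoother(Phi, xp, Pp, xf, Pf)
```

The smoothed covariances are never larger than the filtered ones, reflecting the extra information that later measurements provide about earlier states.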


V. SYSTEM IDENTIFICATION VIA EM ALGORITHM FOR 
TIME SERIES SMOOTHING AND FORECASTING OF AGENT 
ENGAGEMENT 


As discussed before, construction of the graph G relies on
several assumptions which affect the accuracy of our
estimation of the state vector $x_t$. Rather than relying only on
the transition matrix $\Phi$ from G and a consensus model (the
initial guess), a sequence of operators $\Phi(r)$, r = 1, 2, ..., will be
constructed based on an EM algorithm by using the available
data. This is discussed next. The approach follows the
strategy presented in [54].


The process is started with an initial guess $x_0$, a normal
random vector with mean $\mu$ and covariance matrix $\Sigma$. We
start with p×p diagonal covariance matrices $R = \sigma_R I$,
$Q = \sigma_Q I$, and $\Sigma = \sigma_\Sigma I$. The EM algorithm updates
covariance matrices Q, R, and $\Sigma$ until iterations stabilize,
according to the strategy described in [54], using formulas
summarized in the Appendix.


Application of the EM algorithm seeks to smooth the state x
based on the available data through an iterative update of the
transition operator $\Phi$. Thus, while the initial transition matrix
is built from a consensus model applied to graph G, this
system identification process allows us to identify the model
based on the data and account for the uncertainty levels
associated with the social dynamics in online-postings. And
while in this model the transition matrix is not time dependent,
updates from the EM algorithm do account for the time of
participation of each agent and, implicitly, for the persistence
of the connection between agents, at least in some
time-averaged sense.
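For reference, one common form of the EM updates for a linear Gaussian state-space model is written out below. It is assumed here, not confirmed by the text, that the formulas in the Appendix and in [54] take this form. The matrices A, B, and C are sums over the smoothed moments (this A is unrelated to the adjacency matrix of Section IV), and r indexes the EM iteration.

```latex
\begin{align*}
A &= \sum_{t=1}^{n} \left( P_{t-1}^{n} + x_{t-1}^{n} {x_{t-1}^{n}}' \right), &
B &= \sum_{t=1}^{n} \left( P_{t,t-1}^{n} + x_{t}^{n} {x_{t-1}^{n}}' \right), &
C &= \sum_{t=1}^{n} \left( P_{t}^{n} + x_{t}^{n} {x_{t}^{n}}' \right), \\
\Phi(r+1) &= B A^{-1}, \\
Q(r+1) &= \frac{1}{n} \left( C - B A^{-1} B' \right), \\
R(r+1) &= \frac{1}{n} \sum_{t=1}^{n} \left[ (y_t - M x_t^{n})(y_t - M x_t^{n})' + M P_t^{n} M' \right], \\
\mu(r+1) &= x_0^{n}, \qquad \Sigma(r+1) = P_0^{n}.
\end{align*}
```

Each iteration runs the filter and smoother with the current parameters (the E-step) and then applies these closed-form updates (the M-step). Because $\Phi(r+1)$ is an unconstrained least-squares-type estimate, its entries are free to take either sign.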


VI. RESULTS 


This example illustrates the modeling strategy using data 
collected over a five-year period. There are a total of p=46 
participants in the subforum, identified by distinct handles, 
contributing to N=126 threads. Graphs associated with each 
thread were built as described in the previous section. A 
sample of 12 popular threads, as well as the complete graph, 
are shown in Fig. 3. Data regarding contributions over time 
displayed in Fig. 2 come from these threads. The data 
collected were organized in n=40 overlapping bins $b_t$ of length
$\Delta = 50$ with increment d=25, starting from $\tau = 500$, from which $y_t$ was
built. $\sigma_R = 0.2$, $\sigma_Q = 0.2$, and $\sigma_\Sigma = 0.2$ were used for the initial
covariance matrices $R = \sigma_R I$, $Q = \sigma_Q I$, and $\Sigma = \sigma_\Sigma I$,
respectively. The EM algorithm produced the finalized
transition matrix $\Phi$, and the smoothed states (orange dots) are
plotted along with observations (blue dots) as shown in Fig. 4.


Fig. 3: Complete collective graph and a sample of 12 popular threads as
sub-graphs from the data.


VI. DISCUSSION


As can be seen in Fig. 4, the identified transition matrix $\Phi$
allows us to smooth the state x successfully. The smoothed
states match the observations well, which indicates that our
scheme identifies a reasonable system model in a data-driven way.


Sub-graphs: As shown in Figs. 3 and 4, participation in 
threads is consistent among a small number of users who serve 
to connect otherwise individual isolates who post infrequently. 
This finding again supports the notion that those frequent 


posters may have greater influence on the discussions within 
the forum generally. 


The identified transition matrix: Our approach reveals the
influence among agents in the estimated adjacency-like matrix
in a data-driven way. Note that we start the EM algorithm with
a simple consensus-type model, i.e., $a_{ij} = 1 > 0$,
assuming that all connected agents influence each other
with equal magnitude and tend to become similar. Without
uncertainties and under some mild conditions, the evolution (i.e., the
consensus algorithm) makes the state entries converge
to the averaged value when the graph is connected [54].
However, the identified transition matrix shows that the
estimated $a_{ij}$ and the associated adjacency matrix contain
both positive and negative values, which
promote similarity and dissimilarity, respectively, between
agents i and j. Positive and negative edge
modeling has attracted many researchers in modeling social
networks [55, 56]. We may view our findings as positive and
negative edge modeling on online social networks in a
dynamic system setting. This identified static adjacency
matrix shows how a particular agent, e.g., agent i, influences agent j
on average over the evolution, in a data-driven way.


Prediction of the future: The system identified here can
also be used to predict the states at future times
based on the data. It will be of great importance to gauge the
prediction capability using a larger set of longitudinal data,
which can be divided into two segments: the system is identified
from the first segment and then used to predict the state
evolution in the second. Furthermore, the computed covariance
matrices allow us to compute confidence intervals for
such predicted state variables. With the positive and negative
contributions between agents, we found that the transition
matrix is unstable, i.e., there exists an eigenvalue outside of
the unit circle. In this case, the confidence intervals will
widen as the prediction time increases.
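The effect of an unstable transition matrix on forecast uncertainty can be illustrated with a short sketch. Everything here is hypothetical: the function names, the 2×2 matrix (chosen to have one eigenvalue outside the unit circle), the noise level, and the Gaussian 95% interval are assumptions for illustration and are not the values identified from the forum data.

```python
import numpy as np


def forecast(Phi, Q, x_last, P_last, horizon):
    """Propagate the state mean and covariance forward without new data."""
    x = np.asarray(x_last, dtype=float)
    P = np.asarray(P_last, dtype=float)
    means, covs = [], []
    for _ in range(horizon):
        x = Phi @ x
        P = Phi @ P @ Phi.T + Q
        means.append(x)
        covs.append(P)
    return np.array(means), np.array(covs)


def spectral_radius(Phi):
    """Largest eigenvalue magnitude; a value above 1 indicates instability."""
    return np.max(np.abs(np.linalg.eigvals(Phi)))


def interval_halfwidths(covs, z=1.96):
    """Approximate 95% half-widths per agent, assuming Gaussian errors."""
    return z * np.sqrt(np.diagonal(covs, axis1=1, axis2=2))


# Hypothetical unstable transition matrix (eigenvalues 1.1 and 0.5).
Phi = np.array([[1.1, -0.2],
                [0.0, 0.5]])
Q = 0.2 * np.eye(2)
means, covs = forecast(Phi, Q, x_last=[0.5, 0.5], P_last=0.1 * np.eye(2), horizon=10)
rho = spectral_radius(Phi)
half = interval_halfwidths(covs)
print(rho)
print(half)
```

In this toy case the interval for the first agent keeps growing with the horizon, while that of the second agent levels off, since only the first is driven by the eigenvalue outside the unit circle.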


Further refinement on model parameterization: We may
constrain any model parameters on scientific grounds
in the maximization step of the EM algorithm. Instead of
identifying a time-averaged transition matrix over the whole
longitudinal data, we may seek a time-varying transition
matrix. In fact, by incorporating the sub-graphs in Fig. 3 in
a timely manner, we may find further time-varying evidence
among the social network interactions. To synchronize the
sub-graphs shown in Fig. 3 with our transition matrix, we may
parameterize it as a time-varying one, which will increase the
number of unknown parameters proportionally to that of
integrations. The number of model parameters can be reduced
in this case, for example, if agent i does not have a chance to
read the post by agent j, i.e., agent j does not have a chance to
influence agent i in this dynamic process at time t, we can
constrain $a_{ij}(t) = 0$.


VII. CONCLUSIONS


We presented a stochastic linear system to represent the
evolution of the contribution in online-postings. The
contribution or engagement of an agent was modeled as a
state variable in the online-forum. The EM algorithm
identified the model parameters including the adjacency
matrix of the graph constructed among participating agents
along with measurement and system noise processes in
online-postings. Our approach revealed the identified
dynamical influences among agents in the time-varying
shaping of the contribution in a data-driven fashion. We used
the real-world data from online-postings to demonstrate the
usefulness of our approach. The smoothed state variables
match well with the observations. We also discussed how to
predict the future states and to refine the model in order to
synchronize and take into account sub-graphs.


Further research is needed to address the
connection between post frequency, network position, and
influence on radicalization. Specifically, the smoothed state 
analyses suggest participation in threads is constant for a very 
small number of individuals. Instead, there appears to be 
some factor affecting the frequency of participation for 
several users which may peak at a specific point in time. For 
those individuals who appear to be on an upward 
participation trajectory, it is pivotal that we understand if this 
will be sustained or will lead to a sudden removal from the 
network for some reason. 


Further study is also needed to understand how the 
attitudes and beliefs of the individuals change during this 
period of increased participation toward more radical points 
of view. Linguistic and semantic analyses of posts and
transitions in the extremist views presented are needed in
order to improve our understanding of the process of
radicalization in on-line communication media.


There is also a need to assess changes in the differences in 
posting frequencies and behaviors surrounding key events in 
the real world. For instance, if a major terror plot either 
succeeds or is foiled and receives major news coverage, it is 
plausible this would directly affect the quantity and quality of 
posts made by participants in various forums. This analysis 
did not control for real world events and their prospective 
influence on posting behaviors, though such analyses would 
benefit our knowledge of reciprocal relationship between 
virtual and real events [35]. 


In addition, this study could not account for the influence 
of private messages exchanged between participants either within 
this forum or through external channels of communication, 
including email and other forms of CMC. These hidden 
communications may prove essential in explicating the role of 
individuals acting as hubs for radicalization messages. Such 
data may be difficult to identify and require innovative 
collection strategies in order to be obtained [32, 33]. Open 
source databases, such as those used to address successful, 
failed, and foiled extremist plots, may provide a resource to 
uncover hidden associations between actors and better inform 
our understanding of the process of radicalization generally [44]. 


Fig. 4: Observations and smoothed states for agents 1-10 over iterations 
based on the identified system parameters via the EM algorithm: Blue dots 
denote the observations and orange dots denote the smoothed states. 


Finally, as this study focused on a far-right extremist group 
forum, the findings may not be applicable to jihadist groups. 
Groups such as ISIS and Al-Qaeda appear to use CMCs, 
particularly social media, in more deliberate and strategic 
ways than far-right groups [23, 49, 52]. Future research is 
needed with robust comparative samples of posts from various 
online communications platforms made by these two 
movements in order to better compare their practices and 
organizational characteristics. In turn, we may be able to 
better document the structural differences in online networks 
and their role in radicalization. 


APPENDIX 


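The standard Kalman filter equations are as follows: 

$$
\begin{aligned}
x_t^{t-1} &= \Phi x_{t-1}^{t-1} \\
P_t^{t-1} &= \Phi P_{t-1}^{t-1} \Phi' + Q \\
K_t &= P_t^{t-1} M' \left( M P_t^{t-1} M' + R \right)^{-1} \\
x_t^{t} &= x_t^{t-1} + K_t \left( y_t - M x_t^{t-1} \right) \\
P_t^{t} &= P_t^{t-1} - K_t M P_t^{t-1}
\end{aligned}
$$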
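
As an illustration, the forward recursions above can be sketched in Python with NumPy. This is a minimal sketch and not the implementation used in this study: the function name `kalman_filter`, the array layout (index 0 stores the initial mean and covariance, index t stores the quantities at time t), and the argument names `mu` and `Sigma` for the initial state are assumptions made for the example.

```python
import numpy as np

def kalman_filter(y, Phi, M, Q, R, mu, Sigma):
    """Forward Kalman recursions for x_t = Phi x_{t-1} + w_t, y_t = M x_t + v_t.

    y has shape (n, q); row t-1 holds the observation y_t.
    Returns predicted (xp, Pp) and filtered (xf, Pf) moments, plus the last gain.
    Illustrative layout: index 0 is the initial state, predictions at index 0 are unused.
    """
    n, p = len(y), Phi.shape[0]
    xp = np.zeros((n + 1, p))
    Pp = np.zeros((n + 1, p, p))
    xf = np.zeros((n + 1, p))
    Pf = np.zeros((n + 1, p, p))
    xf[0], Pf[0] = mu, Sigma
    K = None
    for t in range(1, n + 1):
        # Prediction: x_t^{t-1} and P_t^{t-1}
        xp[t] = Phi @ xf[t - 1]
        Pp[t] = Phi @ Pf[t - 1] @ Phi.T + Q
        # Gain K_t
        K = Pp[t] @ M.T @ np.linalg.inv(M @ Pp[t] @ M.T + R)
        # Update: x_t^t and P_t^t
        xf[t] = xp[t] + K @ (y[t - 1] - M @ xp[t])
        Pf[t] = Pp[t] - K @ M @ Pp[t]
    return xp, Pp, xf, Pf, K
```

The predicted and filtered quantities are both returned because the backward smoothing recursions need them, and the final gain is returned because it initializes the lag-one covariance recursion.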


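Kalman smoothing is needed to obtain the matrices A, 
B, and C for the EM algorithm. To this end, we perform the 
set of backward recursions for t = n, n-1, ..., 1 on the following 
equations [54]. 

$$
\begin{aligned}
J_{t-1} &= P_{t-1}^{t-1} \Phi' \left( P_t^{t-1} \right)^{-1} \\
x_{t-1}^{n} &= x_{t-1}^{t-1} + J_{t-1} \left( x_t^{n} - \Phi x_{t-1}^{t-1} \right) \\
P_{t-1}^{n} &= P_{t-1}^{t-1} + J_{t-1} \left( P_t^{n} - P_t^{t-1} \right) J_{t-1}'
\end{aligned}
$$

The lag-one covariances follow, for t = n, n-1, ..., 2, from 

$$
P_{t-1,t-2}^{n} = P_{t-1}^{t-1} J_{t-2}' + J_{t-1} \left( P_{t,t-1}^{n} - \Phi P_{t-1}^{t-1} \right) J_{t-2}'
$$

with 

$$
P_{n,n-1}^{n} = \left( I - K_n M \right) \Phi P_{n-1}^{n-1}
$$

$$
\begin{aligned}
A &= \sum_{t=1}^{n} \left( P_{t-1}^{n} + x_{t-1}^{n} x_{t-1}^{n\prime} \right) \\
B &= \sum_{t=1}^{n} \left( P_{t,t-1}^{n} + x_t^{n} x_{t-1}^{n\prime} \right) \\
C &= \sum_{t=1}^{n} \left( P_t^{n} + x_t^{n} x_t^{n\prime} \right)
\end{aligned}
$$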
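
The backward recursions and the sums A, B, and C can be sketched as follows. This is a hedged illustration in Python with NumPy, not the code used in this study. The function name `smooth_and_accumulate` and the array layout are assumptions: `xf[t]`, `Pf[t]` hold the filtered mean and covariance for t = 0..n (index 0 is the initial state), `xp[t]`, `Pp[t]` hold the one-step predictions for t = 1..n, and `K_n` is the Kalman gain at the final time step.

```python
import numpy as np

def smooth_and_accumulate(xp, Pp, xf, Pf, K_n, Phi, M):
    """Backward smoothing recursions plus the EM sufficient statistics A, B, C."""
    n = len(xf) - 1
    p = Phi.shape[0]
    xs, Ps = xf.copy(), Pf.copy()      # entry n already equals x_n^n, P_n^n
    J = np.zeros((n, p, p))            # J[t-1] stores J_{t-1}
    for t in range(n, 0, -1):
        J[t - 1] = Pf[t - 1] @ Phi.T @ np.linalg.inv(Pp[t])
        xs[t - 1] = xf[t - 1] + J[t - 1] @ (xs[t] - Phi @ xf[t - 1])
        Ps[t - 1] = Pf[t - 1] + J[t - 1] @ (Ps[t] - Pp[t]) @ J[t - 1].T

    # Lag-one smoothed covariances: Pcs[t] stores P_{t,t-1}^n for t = 1..n
    Pcs = np.zeros((n + 1, p, p))
    Pcs[n] = (np.eye(p) - K_n @ M) @ Phi @ Pf[n - 1]
    for t in range(n, 1, -1):
        Pcs[t - 1] = (Pf[t - 1] @ J[t - 2].T
                      + J[t - 1] @ (Pcs[t] - Phi @ Pf[t - 1]) @ J[t - 2].T)

    # Sums over t = 1..n
    A = sum(Ps[t - 1] + np.outer(xs[t - 1], xs[t - 1]) for t in range(1, n + 1))
    B = sum(Pcs[t] + np.outer(xs[t], xs[t - 1]) for t in range(1, n + 1))
    C = sum(Ps[t] + np.outer(xs[t], xs[t]) for t in range(1, n + 1))
    return xs, Ps, Pcs, A, B, C
```

The inputs are the outputs of any forward Kalman pass that uses the same indexing, so the sketch can be checked on small hand-computed cases before being applied to posting data.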


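After building the update matrices A, B, and C, an updated 
operator Φ is obtained from 

$$
\Phi(r+1) = B A^{-1}
$$

along with updated covariance matrices 

$$
\begin{aligned}
Q(r+1) &= n^{-1} \left( C - B A^{-1} B' \right) \\
R(r+1) &= n^{-1} \sum_{t=1}^{n} \left[ \left( y_t - M x_t^{n} \right) \left( y_t - M x_t^{n} \right)' + M P_t^{n} M' \right] \\
\Sigma(r+1) &= p^{-1} \operatorname{Tr}\left( P_0^{n} \right) I
\end{aligned}
$$

and vector 

$$
\mu(r+1) = x_0^{n}
$$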

ACKNOWLEDGMENT 


This project was supported by Award No. 2014-ZA-BX- 
0004, awarded by the National Institute of Justice, Office of 
Justice Programs, U. S. Department of Justice. The opinions, 
findings, and conclusions or recommendations expressed in 
this publication are those of the authors and do not reflect those 
of the Department of Justice. 


REFERENCES 


[1] A. Silke, "Research on terrorism: A review of the impact of 9/11 
and the global war on terrorism," in Terrorism informatics: 
Knowledge management and data mining for homeland security, H. 
Chen, E. Reid, J. Sinai, A. Silke, and B. Ganor, Eds., New York: 
Springer, 2008, pp. 27-50. 

[2] E. Bakker, Jihadi Terrorists in Europe, Their Characteristics and 
the Circumstances in Which They Joined the Jihad: An Exploratory 
Study. The Hague: Clingendael Institute, 2006. 

[3] R. Borum, "Radicalization into violent extremism I: A review of 
social science theories," J Strat Sec, vol. 4, pp. 7-36, 2011. 

[4] R. Borum, "Radicalization into violent extremism II: A review of 
conceptual models and empirical research," J Strat Sec, vol. 4, pp. 
37-62, 2011. 

[5] M. Hamm, Terrorism as Crime: From Oklahoma City to Al 
Qaeda and Beyond. New York, NY: New York University Press, 
2007. 

[6] A. B. Krueger, What Makes a Terrorist: Economics and the Roots 
of Terrorism. Princeton, NJ: Princeton University Press, 2007. 

[7] C. McCauley, and S. Moskalenko, "Mechanisms of Political 
Radicalization: Pathways Toward Terrorism," Terror. Political 
Violence, vol. 20, pp. 415-433, 2008. 

[8] J. Monahan, "The individual risk assessment of terrorism," 
Psychol. Public Policy Law, vol. 18, p. 167, 2012. 

[9] M. Sageman, Understanding Terrorist Networks. University City, 
PA: University of Pennsylvania Press, 2004. 

[10] M. D. Silber, The Al Qaeda factor: plots against the West. 
University City, PA: University of Pennsylvania Press, 2011. 

[11] P. Simi, and R. Futrell, American Swastika: Inside the 
White Power Movement's Hidden Spaces of Hate. New York 
City: Rowman & Littlefield Publishers, 2010. 

[12] J. Stern, Terror in the Name of God: Why Religious Militants 
Kill. New York: HarperCollins Publishers Inc, 2003. 

[13] White House, Empowering Local Partners to Prevent Violent 
Extremism in the United States. Washington, D. C.: White House, 
2011. 

http://www.whitehouse.gov/sites/default/files/empowering_local_partners.pdf 

[14] White House, Working to counter online radicalization to 
violence in the United States. Washington DC: White House, 2013. 
https://www.whitehouse.gov/blog/2013/02/05/working-counter- 
online-radicalization-violence-united-states 

[15] P. Jenkins, Pedophiles and Priests: Anatomy of a Contemporary 
Crisis. Oxford University Press, 2001. 

[16] M. T. Britz, "Terrorism and technology: Operationalizing 
cyberterrorism and identifying concepts," in Crime On-Line: 
Correlates, Causes, and Context, T. Holt, Ed., Raleigh, NC: Carolina 
Academic Press, 2010, pp. 193-220. 

[17] P. B. Gerstenfeld, D. R. Grant, and C. P. Chiang, "Hate online: 
A content analysis of extremist Internet sites," Anal Soc Issues Public 
Policy, vol. 3, pp. 29-44, 2003. 

[18] A. Goldsmith, and R. Brewer, "Digital drift and the criminal 
interaction order," Theoretical Criminology, vol. 19, pp. 112-130, 
2015. 

[19] G. Weimann, "How modern terrorism uses the Internet," Int. 
Security, vol. 8, 2005. 

[20] G. Weimann, "Cyber-Fatwas and terrorism," Stud Confl Terror, 
vol. 34, pp. 765-781, 2011. 

[21] M. Gruen, "Innovative recruitment and indoctrination tactics by 
extremists: Video games, hip hop, and the World Wide Web," in The 
making of a terrorist, J. F. Forest, Ed., Westport, CT: Praeger, 
2005, pp. 67-89. 


[22] J. J. Forest, Influence warfare: How terrorists and governments 
struggle to shape perceptions in a war of ideas. Westport, CT: Praeger, 
2009. 

[23] J. Kunkle, "Social Media and the Homegrown Terrorist Threat," 
The Police Chief, vol. 79, 22, 2012. 

[24] C. McCauley, and S. Moskalenko, Friction: How Radicalization 
Happens to Them and Us. Oxford: Oxford University Press, 2011. 

[25] E. Lee, and L. Leets, "Persuasive storytelling by hate groups 
online: Examining its effects on adolescents," Am Beh Sci, vol. 45, pp. 
927-957, 2002. 

[26] T. J. Holt, "Exploring strategies for qualitative criminological 
and criminal justice inquiry using on-line data," J. Crim. Jus. Ed., 
vol. 21, pp. 300-321, 2010. 

[27] E. Quayle, and M. Taylor, "Child pornography and the Internet: 
Perpetuating a cycle of abuse," Deviant Behav, vol. 23, pp. 331-361, 
2002. 

[28] J. F. Quinn, and C. J. Forsyth, "Describing sexual behavior in 
the era of the Internet: A typology for empirical research," Deviant 
Behav, vol. 26, pp. 191-207, 2005. 

[29] B. Burkhalter, "Reading race online: Discovering racial identity 
in Usenet discussions," in Communities in Cyberspace, M. A. 
Smith, and P. Kollock, Eds., New York: Routledge, 1999, pp. 60-74. 

[30] B. Ebo, Cyberghetto or Cybertopia: Race, Class, and Gender on 
the Internet. Westport, CT: Praeger, 1998. 

[31] D. Miller, and D. Slater, The Internet: An Ethnographic 
Approach. Oxford: Berg, 2000. 

[32] B. Chu, T. J. Holt, and G. J. Ahn, Examining the Creation, 
Distribution, and Function of Malware On-Line. Technical Report 
for National Institute of Justice. NIJ Grant No. 2007-IJ-CX-0018, 
2010. 

[33] T. J. Holt, and E. Lampke, "Exploring stolen data markets on- 
line: Products and market forces," Crim. Just. St., vol. 23, pp. 33-50, 
2010. 

[34] D. Mann, and M. Sutton, "Netcrime: More changes in the 
organization of thieving," Brit Jour Crim, vol. 38, pp. 201-229, 1998. 

[35] T. J. Holt, "Subcultural evolution? Examining the influence of 
on- and off-line experiences on deviant subcultures," J. Dev. Beh., 
vol. 28, pp. 171-198, 2007. 

[36] S. Chermak, Searching for a demon: The media's construction of 
the militia movement. Boston, MA: Northeastern University Press, 
2002. 

[37] J. D. Freilich, American militia: State level variations in militia 
activities. New York, NY: LFB Scholarly Publishing, LLC, 2003. 

[38] J. D. Freilich, N. A. P. Almanzar, and C. J. Rivera, "How social 
movement organizations explicitly and implicitly promote deviant 
behavior: The case of the militia movement," Justice Quarterly, vol. 
16, pp. 655-683, 1999. 

[39] J. Stern, Terror in the Name of God: Why Religious Militants 
Kill. Trenton, NJ: HarperCollins Publishers Inc, 2003. 

[40] M. Whine, "The use of the internet by far right extremists," in 
Cybercrime: Law, security, and privacy in the Information Age, B. 
Loader, and D. Thomas, Eds., New York: Routledge Press, 2000, pp. 
115-145. 

[41] Anti-Defamation League, Racist groups use computer gaming to 
promote violence against Blacks, Latinos, and Jews, New York: 
Anti-Defamation League, 2004. 


[42] A. Corb, Into the minds of mayhem: White supremacy, 
recruitment, and the Internet, 2011. 

[43] B. Levin, "Cyberhate: A legal and historical analysis of 
extremists' use of computer networks in America," Am Beh Sci, vol. 
45, pp. 958-988, 2002. 

[44] T. J. Holt, "Exploring the Intersections of Technology, Crime 
and Terror," Ter Pol Vio, vol. 24, pp. 337-354, 2012. 

[45] H. Beirich, White homicide worldwide, Southern Poverty Law 
Center, 2014. 

[46] L. Bowman-Grieve, "Anti-abortion Extremism Online," First 
Monday: Peer-Reviewed Journal on the Internet, vol. 14, pp. 11, 
2009. 

[47] Chermak et al. 2013 

[48] S. Anahita, "Blogging the borders: Virtual skinheads, 
hypermasculinity, and heteronormativity," JPMS, vol. 34, pp. 143, 
2006. 

[49] E. Erez, G. Weimann, and A. Weisburd, Jihad, crime, and 
the internet: Content analysis of Jihadist forum discussions. 
Washington DC: National Institute of Justice, 2011. 

[50] D. Decary-Hetu, and B. Dupont, "The social network of 
hackers," GC, vol. 13, pp. 160-173, 2012. 

[51] R. M. Bakker, J. Raab, and H. B. Milward, "A preliminary 
theory of network resilience," Jour Pol Analysis Man, vol. 31, pp. 33- 
62, 2012. 


[52] C. Leuprecht, T. Hataley, S. Moskalenko, and C. McCauley, 
"Narratives and counter narratives for Global Jihad: Opinion versus 
action," in Countering violent extremist narratives, E. J. A. M. 
Kessels, Ed., Washington DC: National Coordinator for 
Counterterrorism (NCTb), 2010. 
http://english.nctb.nl/Images/Countering%20Violent%20Extremist%20Narratives_tcm92-259489.pdf?cp=92&cs=25496 

[53] R. Olfati-Saber, and R. M. Murray, "Consensus Problems in 
Networks of Agents with Switching Topology and Time-Delays," 
IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 215-233, 
September 2004. 

[54] R. H. Shumway, and D. S. Stoffer, "An Approach to Time Series 
Smoothing and Forecasting Using the EM Algorithm," Journal of 
Time Series Analysis, vol. 3, no. 4, pp. 253-264, 1982. 

[55] J. Leskovec, D. Huttenlocher, and J. Kleinberg, "Predicting 
positive and negative links in online social networks," in 
Proceedings of the 19th international conference on World wide web, 
ACM, April 2010, pp. 641-650. 

[56] J. Kunegis, A. Lommatzsch, and C. Bauckhage, "The slashdot 
zoo: mining a social network with negative edges," in 
Proceedings of the 18th international conference on World wide web, 
ACM, April 2009, pp. 741-750. 


